13 research outputs found

Visualization Tools for Comparative Genomics applied to Convergent Evolution in Ash Trees

Author: Seaman Josiah
Publication venue: Queen Mary University of London
Publication date: 01/01/2021

Assembly and analysis of whole genomes is now a routine part of genetic research, but effective tools for the visualization of whole genomes and their alignments are few. Here we present two approaches to allow such visualizations to be done in an efficient and user-friendly manner. These allow researchers to spot problems and patterns in their data and present them effectively. First, FluentDNA is developed to tackle single full genome visualization and assembly tasks by representing nucleotides as colored pixels in a zooming interface. This enables users to identify features without relying on algorithmic annotation. FluentDNA also supports visualizing pairwise alignments of well-assembled whole genomes from chromosome to nucleotide resolution. Second, Pantograph is developed to tackle the problem of visualizing variation among large numbers of whole genome sequences. This uses a graph genome approach, which addresses many of the technical challenges of whole genome multiple sequence alignments by representing aligned sequences as nodes which can be shared by many individuals. Pantograph is capable of scaling to thousands of individuals and is applied to SARS and A. thaliana pangenomes. Alongside the development of these new genomics tools, comparative genomic research was undertaken on worldwide species of ash trees. I assembled 13 ash genomes and used FluentDNA to quality check the results and discovered contaminants and a mitochondrial integration. I annotated protein coding genes in 28 ash assemblies and aligned their gene families. Using phylogenetic analysis, I identified gene duplications that likely occurred in an ancient whole genome duplication shared by all ash species. I examined the fate of these duplicated genes, showing that losses are concentrated in a subset of gene families more often than predicted by a null model simulation.
I conclude that convergent evolution has occurred in the loss and retention of duplicated genes in different ash species. BBSRC BB/S004661/

Shared Research Repository

Queen Mary Research Online

Skittle: A 2-Dimensional Genome Visualization Tool

Author: Birney
D Sussillo
E Lieberman-Aiden
EN Trifonov
G Benson
GM Weinstock
GS Baldwin
I López-Villaseñor
J Sánchez
JF Canny
John C Sanford
Josiah D Seaman
M Costantini
MB Gerstein
MK Rudd
P Schieg
S Kurtz
X She
Publication venue: BioMed Central
Publication date: 01/01/2009

Background: It is increasingly evident that there are multiple and overlapping patterns within the genome, and that these patterns contain different types of information, regarding both genome function and genome history. In order to discover additional genomic patterns which may have biological significance, novel strategies are required. To partially address this need, we introduce a new data visualization tool entitled Skittle. Results: This program first creates a 2-dimensional nucleotide display by assigning four colors to the four nucleotides, and then text-wraps to a user-adjustable width. This nucleotide display is accompanied by a "repeat map" which comprehensively displays all local repeating units, based upon analysis of all possible local alignments. Skittle includes a smooth-zooming interface which allows the user to analyze genomic patterns at any scale. Skittle is especially useful in identifying and analyzing tandem repeats, including repeats not normally detectable by other methods. However, Skittle is also more generally useful for analysis of any genomic data, allowing users to correlate published annotations and observable visual patterns, and allowing for sequence and construct quality control. Conclusions: Preliminary observations using Skittle reveal intriguing genomic patterns not otherwise obvious, including structured variations inside tandem repeats. The striking visual patterns revealed by Skittle appear to be useful for hypothesis development, and have already led the authors to theorize that imperfect tandem repeats could act as information carriers, and may form tertiary structures within the interphase nucleus.
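
The display described in the abstract above (one color per nucleotide, text-wrapped to an adjustable width) can be illustrated with a minimal Python sketch. This is not Skittle's code: the function name `wrap_sequence`, the specific RGB values, and the gray fallback for unknown bases are all assumptions made for illustration.

```python
# Hypothetical color assignment; the abstract does not say which colors are used.
NUCLEOTIDE_COLORS = {
    "A": (0, 0, 0),      # black
    "C": (255, 0, 0),    # red
    "G": (0, 255, 0),    # green
    "T": (0, 0, 255),    # blue
}
UNKNOWN_COLOR = (128, 128, 128)  # assumed fallback for N and other symbols


def wrap_sequence(sequence, width):
    """Map each base to an RGB pixel and text-wrap the pixels into rows.

    Every row holds `width` pixels except possibly the last one.
    """
    if width < 1:
        raise ValueError("width must be at least 1")
    pixels = [NUCLEOTIDE_COLORS.get(base, UNKNOWN_COLOR) for base in sequence.upper()]
    return [pixels[start:start + width] for start in range(0, len(pixels), width)]


# A tandem repeat with a 4-base unit, wrapped at width 4, gives identical rows,
# which would appear as vertical stripes in an image built from these rows.
rows = wrap_sequence("ACGT" * 3, width=4)
```

When the wrap width equals the repeat unit length, every row is the same, which is one plausible reading of why such a display makes tandem repeats easy to spot by eye.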

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

The khmer software package: enabling efficient nucleotide sequence analysis [version 1; referees: 2 approved, 1 approved with reservations]

Author: Alameldin Hussien F.
Awad Sherine
Boucher Elmar
Brown C. Titus
Caldwell Adam
Cartwright Reed
Charbonneau Amanda
Constantinides Bede
Crusoe Michael R.
Edvenson Greg
Fay Scott
Fenton Jacob
Fenzl Thomas
Fish Jordan
Garcia-Gutierrez Leonor
Garland Phillip
Gluck Jonathan
González Iván
Guermond Sarah
Guo Jiarong
Gupta Aditi
Herr Joshua R.
Howe Adina
Hyer Alex
Härpfer Andreas
Irber Luiz
Kidd Rhys
Lin David
Lippi Justin
Mansour Tamer
McA'Nulty Pamela
McDonald Eric
Mizzi Jessica
Murray Kevin D.
Nahum Joshua R.
Nanlohy Kaben
Nederbragt Alexander Johan
Ortiz-Zuazaga Humberto
Ory Jeramia
Pell Jason
Pepe-Ranney Charles
Russ Zachary N.
Schwarz Erich
Scott Camille
Seaman Josiah
Sievert Scott
Simpson Jared
Skennerton Connor T.
Spencer James
Srinivasan Ramakrishnan
Standage Daniel
Stapleton James A.
Stein Joe
Steinman Susan R.
Taylor Benjamin
Trimble Will
Wiencko Heather L.
Wright Michael
Wyss Brian
Zhang Qingpeng
zyme en
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 15/10/2015

The khmer package is a freely available software library for working efficiently with fixed length DNA words, or k-mers. khmer provides implementations of a probabilistic k-mer counting data structure, a compressible De Bruijn graph representation, De Bruijn graph partitioning, and digital normalization. khmer is implemented in C++ and Python, and is freely available under the BSD license at https://github.com/dib-lab/khmer/
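
To make the idea of probabilistic k-mer counting concrete, here is a minimal Python sketch of a Count-Min Sketch applied to k-mers. It is an illustration of the general technique only and is not khmer's API or implementation: the class name `KmerCountMinSketch`, its parameters, and the choice of BLAKE2b as the hash function are assumptions.

```python
import hashlib


class KmerCountMinSketch:
    """Approximate k-mer counter: may overcount on collisions, never undercounts."""

    def __init__(self, k, width=1024, depth=4):
        self.k = k
        self.width = width    # slots per table (assumed default)
        self.depth = depth    # number of independent tables (assumed default)
        self.tables = [[0] * width for _ in range(depth)]

    def _slots(self, kmer):
        # One salted hash per table, so each table maps the k-mer to its own slot.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                kmer.encode("ascii"),
                digest_size=8,
                salt=row.to_bytes(8, "little"),
            ).digest()
            yield row, int.from_bytes(digest, "little") % self.width

    def add(self, kmer):
        for row, slot in self._slots(kmer):
            self.tables[row][slot] += 1

    def count(self, kmer):
        # The smallest counter is the one least inflated by collisions.
        return min(self.tables[row][slot] for row, slot in self._slots(kmer))

    def consume(self, sequence):
        """Add every overlapping k-mer of `sequence` to the sketch."""
        for start in range(len(sequence) - self.k + 1):
            self.add(sequence[start:start + self.k])


sketch = KmerCountMinSketch(k=3)
sketch.consume("ACGACGT")       # k-mers: ACG, CGA, GAC, ACG, CGT
estimate = sketch.count("ACG")  # at least 2, the true count
```

The memory used is fixed by `width` and `depth` rather than by the number of distinct k-mers, which is the trade-off that makes this kind of structure attractive for large sequencing datasets: counts are estimates that can be too high but not too low.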

The khmer software package: enabling efficient nucleotide sequence analysis

Author: Alameldin Hussien F.
Awad Sherine
Boucher Elmar
Brown C. Titus
Caldwell Adam
Cartwright Reed
Charbonneau Amanda
Constantinides Bede
Crusoe Michael R.
Edvenson Greg
Fay Scott
Fenton Jacob
Fenzl Thomas
Fish Jordan
Garcia-Gutierrez Leonor
Garland Phillip
Gluck Jonathan
González Iván
Guermond Sarah
Guo Jiarong
Gupta Aditi
Herr Joshua R.
Howe Adina C.
Hyer Alex
Härpfer Andreas
Irber Luiz
Kidd Rhys
Lin David
Lippi Justin
Mansour Tamer
McA'Nulty Pamela
McDonald Eric
Mizzi Jessica
Murray Kevin D.
Nahum Joshua R.
Nanlohy Kaben
Nederbragt Alexander Johan
Ortiz-Zuazaga Humberto
Ory Jeramia
Pell Jason
Pepe-Ranney Charles
Russ Zachary N.
Schwarz Erich
Scott Camille
Seaman Josiah
Sievert Scott
Simpson Jared
Skennerton Connor T.
Spencer James
Srinivasan Ramakrishnan
Standage Daniel S.
Stapleton James A.
Stein Joe
Steinman Susan R.
Taylor Benjamin
Trimble Will
Wiencko Heather L.
Wright Michael
Wyss Brian
Zhang Qingpeng
zyme en
Publication venue: Iowa State University Digital Repository
Publication date: 25/09/2015

Digital Repository @ Iowa State University (ISU)

PubMed Central

eScholarship - University of California

Pangenome Graphs.

Author: Chang Xian
Ebler Jana
Eizenga Jordan
Garg Shilpa
Garrison Erik
Ghaffaari Ali
Heumos Simon
Hickey Glenn
Marschall Tobias
Novak Adam
Paten Benedict
Rautiainen Mikko
Rounthwaite Robin
Seaman Josiah
Sibbesen Jonas
Sirén Jouni
Publication venue: eScholarship, University of California
Publication date: 31/08/2020

Low-cost whole-genome assembly has enabled the collection of haplotype-resolved pangenomes for numerous organisms. In turn, this technological change is encouraging the development of methods that can precisely address the sequence and variation described in large collections of related genomes. These approaches often use graphical models of the pangenome to support algorithms for sequence alignment, visualization, functional genomics, and association studies. The additional information provided to these methods by the pangenome allows them to achieve superior performance on a variety of bioinformatic tasks, including read alignment, variant calling, and genotyping. Pangenome graphs stand to become a ubiquitous tool in genomics. Although it is unclear whether they will replace linear reference genomes, their ability to harmoniously relate multiple sequence and coordinate systems will make them useful irrespective of which pangenomic models become most common in the future.

eScholarship - University of California

A high‐quality reference genome for Fraxinus pennsylvanica for ash species restoration and research.

Author: Best Teodora
Buggs Richard
Carlson John E.
Cooper Endymion
Faridi Nurul
Huff Matt
Kelly Laura J.
Koch Jennifer
Nelson Charles D.
Romero Severson Jeanne
Seaman Josiah
Staton Margaret
Steiner Kim
Wu Di
Zhebentyayeva Tetyana
Publication venue: Wiley
Publication date: 27/11/2021

Green ash (Fraxinus pennsylvanica) is the most widely distributed ash tree in North America. Once common, it has experienced high mortality from the non‐native invasive emerald ash borer (EAB; Agrilus planipennis). A small percentage of native green ash trees that remain healthy in long‐infested areas, termed “lingering ash,” display partial resistance to the insect, indicating that breeding and propagating populations with higher resistance to EAB may be possible. To assist in ash breeding, ecology and evolution studies, we report the first chromosome‐level assembly from the genus Fraxinus for F. pennsylvanica with over 99% of bases anchored to 23 haploid chromosomes, spanning 757 Mb in total, composed of 49.43% repetitive DNA, and containing 35,470 high‐confidence gene models assigned to 22,976 Asterid orthogroups. We also present results of range‐wide genetic variation studies, the identification of candidate genes for important traits including potential EAB‐resistance genes, and an investigation of comparative genome organization among Asterids based on this reference genome platform. Residual duplicated regions within the genome probably resulting from a recent whole genome duplication event in Oleaceae were visualized in relation to wild olive (Olea europaea var. sylvestris). We used our F. pennsylvanica chromosome assembly to construct reference‐guided assemblies of 27 previously sequenced Fraxinus taxa, including F. excelsior. Thus, we present a significant step forward in genomic resources for research and protection of Fraxinus species

Shared Research Repository

PubMed Central